---
title: "Data documentation and codebook: Ask the experts? A Delphi survey of immigration to the European Union in 2030"
author: "Authors: Sohst, R., Acostamadiedo, E., Tjaden, J. & de Valk, H."
output:
  html_document:
    css: scripts/css_style/style.css
    number_sections: true
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false

---

<head>

<style>
@import url('https://fonts.googleapis.com/css2?family=Open+Sans&display=swap');
</style>

</head>

```{r loadcharts, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

source('scripts/cleaning_and_management/libraries.R') 

expanel <- readRDS("data/Delphi survey/expanel.rds") 
```

# Introduction 

This document describes the files, folder structure and codebook needed to replicate the analysis in the paper "Ask the experts? A Delphi survey of immigration to the European Union in 2030".

Data for this paper come from two main sources. The first is an original Delphi survey designed to elicit expert estimates of future migration flows. The questionnaire can be reviewed [here](https://publications.iom.int/system/files/pdf/assessing-immigration-scenarios-eu.pdf). Details on how the data were collected can be found [here, on page 14](https://publications.iom.int/system/files/pdf/assessing-immigration-scenarios-eu.pdf). 

The second source is administrative data from Eurostat and Frontex. Details on the origin of the data can be found on page 15 [here](https://publications.iom.int/system/files/pdf/assessing-immigration-scenarios-eu.pdf). 

The scripts contained in this folder describe at length how the data were processed. 

# Folder organization 

1. Folder "data" has two additional subfolders with the Delphi survey collected by the authors and the migration data from Frontex and Eurostat.
    - Delphi survey 
        - expanel.rds
    - Migration data 
        - Detections_of_IBC_2019_09_04.xlsx
        - forecast.rds

2. Folder "output" has all the tables and figures produced in the analysis including the annex.
    - figures 
        - fig3_Relative_Likelihood.eps
        - fig4_Changes_Immigration.eps
        - Annex3_Variation_Convergence.eps
        - Annex4.eps
        - Annex5_Confidence.eps
        - Annex6_Convergence.eps
        - Annex7_Composition_Bottom_Prob.eps
        - Annex8_Composition_Top_Prob.eps
        - Annex10_Arima.eps
    - tables 
        - Table1.xlsx

3. Folder "scripts" has all scripts for replicating all analysis, figures and tables.
    - Folder "cleaning_and_management" has the R packages used for the analysis (libraries), the file to download and tidy up the migration data (migration_data_cleaning), and code to produce the split violin plots and formats of the charts.
        - libraries.R
        - migration_data_cleaning.R
        - split_violin_and_labels.R

    - Folder "css_style" has the CSS code for the html file with the readme and codebook.
        - style.css

    - Folder "figures" has the scripts to produce all figures. 
        - fig3_Relative_Likelihood.R
        - fig4_Changes_Immigration.R
        - Annex3_Variation_Convergence.R
        - Annex4.R
        - Annex5_Confidence.R
        - Annex6_Convergence.R
        - Annex7_Composition_Bottom_Prob.R
        - Annex8_Composition_Top_Prob.R
        - Annex10_Arima.R

    - Folder "tables" has the scripts to produce all tables.
        - table1_samples.R

    - Folder "calculations" has the scripts to produce all calculations in the text.
        - coefficient of variation.R
        - confidence in estimates by wave and type of flow.R

4. File "Readme and Codebook.html"
    - This file.
    
5. File "Readme and Codebook.RMD"
    - Script to produce the codebook.
    
6. File "Migration scenarios.Rproj"
    - R project file. 
    
7. File "readme.txt"
    - Description of files and folders.
    
# How to replicate the analysis and figures?

First, open "Migration scenarios.Rproj" so that the working directory is set to the project folder. Then check that the required R packages are installed; they are listed in "libraries.R" in the folder "scripts", subfolder "cleaning_and_management". Once both steps are done, run the .R files in the folders "figures", "tables" and "calculations".
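
The steps above can also be scripted. The following is a minimal sketch, not part of the original replication package: `run_replication()` is a hypothetical helper that sources "libraries.R" and then every .R file in the three folders, in alphabetical order. It assumes that each script can be run on its own from the project folder; whether the scripts need a particular order, or need other helper scripts sourced first, is not stated here.

```r
# Hypothetical helper (an illustration, not part of the replication package).
# Assumes `root` is the project folder that contains "scripts".
run_replication <- function(root = ".",
                            folders = c("figures", "tables", "calculations")) {
  # Run everything from the project folder and restore the old directory after.
  old_wd <- setwd(root)
  on.exit(setwd(old_wd), add = TRUE)

  # libraries.R loads the packages used by the analysis scripts.
  setup <- file.path("scripts", "cleaning_and_management", "libraries.R")
  if (!file.exists(setup)) {
    stop("Cannot find ", setup, "; is `root` the project folder?")
  }
  source(setup)

  # Collect every .R file in the requested subfolders of "scripts".
  scripts <- unlist(lapply(folders, function(folder) {
    list.files(file.path("scripts", folder),
               pattern = "\\.R$", full.names = TRUE)
  }))

  # Source the scripts one by one, reporting progress.
  for (script in scripts) {
    message("Running ", script)
    source(script)
  }

  invisible(scripts)
}
```

With the project open, `run_replication()` would then run all three folders, and `run_replication(folders = "tables")` only the table scripts.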

# Codebook

## Migration data

This dataset has 3 columns: 1) year in which the flow was recorded, 2) the value recorded, and 3) the type of flow recorded.

## Delphi survey

Please note that the dataset is in "long" format, organised by wave (2 rounds of the Delphi survey), type of elicited data (probability of a migration scenario occurring in 2030, estimated flow, or level of confidence in the elicitation), type of migration flow and migration scenario. When summarising the data, take all of these levels into account to obtain a correct calculation.
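
As an illustration of why these levels matter, the sketch below summarises a small made-up data frame that mimics the long format. The column names follow the codebook, but the values and the category labels (`"flow"`, `"confidence"`, `"scenario 1"`) are assumptions for the example only and may differ from the codes in `expanel`.

```r
library(dplyr)

# Made-up long-format data: 2 experts, 2 waves, 2 types of elicited data.
toy_long <- data.frame(
  id            = rep(1:2, each = 4),
  wave          = rep(c(1, 1, 2, 2), times = 2),
  type          = rep(c("flow", "confidence"), times = 4),
  variable      = "total migration",
  scenario_name = "scenario 1",
  value         = c(100, 3, 110, 4, 200, 2, 190, 5)
)

# Group by every level before averaging, so that flows, confidence scores
# and the two waves are never pooled into a single mean.
toy_summary <- toy_long %>%
  group_by(wave, type, variable, scenario_name) %>%
  summarise(mean_value = mean(value),
            n_experts  = n_distinct(id),
            .groups    = "drop")

print(toy_summary)
```

Dropping `type` from `group_by()` would average flow estimates together with confidence scores, which is the kind of mistake the note above warns against.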

To return the data to the original "wide" format, use the code below. Each row then corresponds to an individual expert (178 in total). 

```{r wide, echo = TRUE}
# expanel %>% 
#  tidyr::pivot_wider(names_from = c(wave, type, variable, scenario_name), values_from = value)
```
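
To show what this reshaping does, the sketch below applies the same `pivot_wider()` call to a small made-up data frame. The column names follow the codebook; the values and category labels are assumptions for the example only.

```r
library(tidyr)

# Made-up long-format data: 2 experts, 2 waves, 2 types of elicited data.
toy_long <- data.frame(
  id            = rep(1:2, each = 4),
  wave          = rep(c(1, 1, 2, 2), times = 2),
  type          = rep(c("flow", "confidence"), times = 4),
  variable      = "total migration",
  scenario_name = "scenario 1",
  value         = c(100, 3, 110, 4, 200, 2, 190, 5)
)

# One row per expert; one column per combination of wave, type,
# variable and scenario. Columns not named here (only `id`) identify rows.
toy_wide <- pivot_wider(
  toy_long,
  names_from  = c(wave, type, variable, scenario_name),
  values_from = value
)

# 2 experts give 2 rows; 1 id column plus 4 value columns give 5 columns.
dim(toy_wide)
```

In the real data, the expert-level variables (for example `stakeholder` or `resid`) are not part of `names_from`, so they stay as ordinary columns next to `id`.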

This is the codebook in the long format. The analysis was performed in this format.

```{r delphicodebook, echo = FALSE}

var_label(expanel) <- list(
  
  type = "Type of elicited estimate: Migration flow, Probability of a migration scenario becoming true or Confidence in the elicited estimate.",
  
  variable = "Type of migration flow or probability of migration scenario becoming true: First-time asylum application, high-skilled migration, irregular border crossing, labour related migration, total migration, or probability of migration scenario becoming true.",
  
  scenario_name = "4 future migration scenarios: 1) Unilateralism and Economic convergence, 2) Multilateralism and Economic convergence, 3) Unilateralism and Economic divergence, 4) Multilateralism and Economic divergence.",
  
  stakeholder = "3 types of expert stakeholders: Scholar, Practitioner or Other. Single selection.",
  
  experience = "Experience in methodologies estimating future migration: Forecasting, Scenario building, Migration drivers, Migration from a world region, Other. Multiple selection.", 
  
  academic = "Expert academic background: Political science, Sociology, Demography, Economics, Law, Psychology. Multiple selection.",
  
  resid = "Country of residency of the expert. Open text.",
  
  regions = "Regions of expertise: Africa, Americas, Asia, Europe, Oceania. Multiple selection.",
  
  scholar_stake = "Dummy variable indicating if expert reported being a scholar.",
  
  practitioner_stake = "Dummy variable indicating if expert reported being a practitioner.",
  
  other_stake = "Dummy variable indicating if expert reported 'Other' as stakeholder.",
  
  migdri_exp = "Dummy variable indicating if expert reported having methodological experience in migration drivers.",
  
  migfor_exp = "Dummy variable indicating if expert reported having methodological experience in forecasting.",
  
  migreg_exp = "Dummy variable indicating if expert reported having methodological experience in migration from a world region.",
  
  migsce_exp = "Dummy variable indicating if expert reported having methodological experience in migration scenarios.",
  
  migoth_exp = "Dummy variable indicating if expert reported having other type of methodological experience.",
  
  polsci_aca = "Dummy variable indicating if expert reported having an academic background in Political Science.",
  
  sociol_aca = "Dummy variable indicating if expert reported having an academic background in Sociology.",
  
  demogr_aca = "Dummy variable indicating if expert reported having an academic background in Demography.",
  
  econom_aca = "Dummy variable indicating if expert reported having an academic background in Economics.",
  
  lawlaw_aca = "Dummy variable indicating if expert reported having an academic background in Law.",
  
  psicho_aca = "Dummy variable indicating if expert reported having an academic background in Psychology.",
  
  other_aca = "Dummy variable indicating if expert reported having another academic background.",
  
  years_cat = "Years of experience of the expert, grouped into 5 categories." ,
  
  africa_regexp =  "Dummy indicating if expert reported having expertise in migration in Africa." ,
  
  americas_regexp = "Dummy indicating if expert reported having expertise in migration in the Americas." ,
  
  asia_regexp  = "Dummy indicating if expert reported having expertise in migration in Asia." ,
  
  europe_regexp = "Dummy indicating if expert reported having expertise in migration in Europe.",
  
  oceania_regexp = "Dummy indicating if expert reported having expertise in migration in Oceania.",
  
  europe = "Dummy indicating if expert reported having expertise in migration in Europe.",
  
  exp = "Dummy indicating if expert has at least 5 years of experience in migration research.",
  
  panel = "Dummy indicating if expert completed both round 1 and round 2.",
  
  wave = "Round 1 or round 2 of the Delphi survey.",
  
  value = "Value elicited from the expert. The value can be a flow, the probability of a scenario becoming real or the confidence in the estimate.",
  
  analised_sample =  "Dummy indicating if expert completed rounds 1 and 2, has at least 5 years of experience and has expertise in European migration.",
  
  yearexp = "Years of experience of the expert, grouped into 8 categories." ,
  
  id = "Unique expert id.")


# Collect the variable labels into a two-column table and display it.
datatable(
  map_df(expanel, function(x) attributes(x)$label) %>% 
    pivot_longer(everything()) %>% 
    dplyr::select(Variable = name, 
                  Description = value)
)


```
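
The table above is built from the `label` attribute that is attached to each column. The following is a minimal sketch in base R of how such labels can be stored and read back, using a made-up two-column data frame rather than `expanel`.

```r
# Made-up data frame standing in for the survey data.
toy <- data.frame(id = 1:2, wave = c(1, 2))

# Attach a description to each column as a "label" attribute.
attr(toy$id, "label")   <- "Unique expert id."
attr(toy$wave, "label") <- "Round 1 or round 2 of the Delphi survey."

# Read the labels back into a two-column codebook table.
codebook <- data.frame(
  Variable    = names(toy),
  Description = vapply(toy, function(x) attr(x, "label"), character(1)),
  row.names   = NULL
)

print(codebook)
```

The same lookup, `attr(expanel$wave, "label")`, should return the description of a single variable once the labels have been assigned.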


